Parallel Thread Execution : English Wikipedia
Parallel Thread Execution (PTX) is a pseudo-assembly language used in Nvidia's CUDA programming environment. The nvcc compiler translates code written in CUDA, a C-like language, into PTX, and the graphics driver contains a compiler that translates the PTX into binary code that can be run on the processing cores.
== Registers ==
PTX uses an arbitrarily large register set; the output from the compiler is almost pure single-assignment form, with consecutive lines generally referring to consecutive registers. Programs start with declarations of the form

.reg .u32 %r<335>; // declare 335 registers %r0, %r1, ..., %r334 of type unsigned 32-bit integer

It is a three-argument assembly language, and almost all instructions explicitly list the data type (in terms of sign and width) on which they operate. Register names are preceded with a % character and constants are literal, e.g.:

shr.u64 %rd14, %rd12, 32; // shift right an unsigned 64-bit integer from %rd12 by 32 positions, result in %rd14
cvt.u64.u32 %rd142, %r112; // convert an unsigned 32-bit integer to 64-bit
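
To make the type suffixes concrete, here is a minimal Python sketch that models the two instructions above on plain integers. The helper names shr_u64 and cvt_u64_u32 are invented for this illustration and are not part of any Nvidia tool; the sketch only assumes that .u64 and .u32 operands are unsigned values of 64 and 32 bits.

```python
MASK32 = (1 << 32) - 1  # all ones in 32 bits
MASK64 = (1 << 64) - 1  # all ones in 64 bits


def shr_u64(a, shift):
    """Model of `shr.u64 d, a, shift`: logical right shift of an unsigned 64-bit value."""
    return (a & MASK64) >> shift


def cvt_u64_u32(a):
    """Model of `cvt.u64.u32 d, a`: zero-extend an unsigned 32-bit value to 64 bits."""
    return a & MASK32


# Hypothetical register contents, chosen only for the demonstration.
rd12 = 0x123456789ABCDEF0
rd14 = shr_u64(rd12, 32)          # keeps only the upper 32 bits of rd12
print(hex(rd14))                  # 0x12345678

r112 = 0xFFFFFFFF
rd142 = cvt_u64_u32(r112)         # same numeric value, now held in a 64-bit register
print(hex(rd142))                 # 0xffffffff
```

Shifting a 64-bit register right by 32 is a common way to extract its upper half, which is why the result fits in 32 bits here.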

There are predicate registers, but compiled code in shader model 1.0 uses them only in conjunction with branch instructions; the conditional branch is

@%p14 bra $label; // branch to $label if predicate %p14 is true

The setp.cc.type instruction, where cc is the comparison operator, sets a predicate register to the result of comparing two registers of appropriate type. There is also a set instruction, which writes the result to an ordinary register: set.le.u32.u64 %r101, %rd12, %rd28 sets the 32-bit register %r101 to 0xffffffff if the 64-bit register %rd12 is less than or equal to the 64-bit register %rd28, and to 0x00000000 otherwise.
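
The difference between the two comparison instructions can be sketched in Python as follows. This is only a model of the behaviour described above, not Nvidia code: the function names setp_le_u64 and set_le_u32_u64 are made up for the example, and the operands are assumed to be unsigned 64-bit values.

```python
MASK64 = (1 << 64) - 1  # all ones in 64 bits


def setp_le_u64(a, b):
    """Model of `setp.le.u64 p, a, b`: the result is a true/false predicate."""
    return (a & MASK64) <= (b & MASK64)


def set_le_u32_u64(a, b):
    """Model of `set.le.u32.u64 d, a, b`: the result is a 32-bit mask,
    all ones if a <= b and all zeros otherwise."""
    return 0xFFFFFFFF if (a & MASK64) <= (b & MASK64) else 0x00000000


print(setp_le_u64(5, 7))            # True
print(hex(set_le_u32_u64(5, 7)))    # 0xffffffff
print(hex(set_le_u32_u64(9, 7)))    # 0x0

# One possible use of the mask form: select a value without branching.
value = 0x0000ABCD
print(hex(value & set_le_u32_u64(5, 7)))  # 0xabcd
print(hex(value & set_le_u32_u64(9, 7)))  # 0x0
```

A predicate can only guard another instruction such as bra, whereas the mask produced by set is ordinary integer data that later instructions can combine with other registers.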
There are a few predefined identifiers that denote pseudoregisters. Among others, %tid, %ntid, %ctaid, and %nctaid contain, respectively, thread indices, block dimensions, block indices, and grid dimensions. (PTX ISA Version 2.3)
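
As an illustration of how these pseudoregisters are typically combined, the Python sketch below computes a unique global index for every thread. It assumes a one-dimensional launch, so only the x components (written %tid.x, %ntid.x, %ctaid.x and %nctaid.x in PTX) are modelled; the function name global_thread_index and the launch sizes are hypothetical.

```python
def global_thread_index(tid_x, ntid_x, ctaid_x):
    """Global index of a thread in a 1-D launch:
    block index * block dimension + thread index within the block."""
    return ctaid_x * ntid_x + tid_x


# Hypothetical launch: 3 blocks (%nctaid.x) of 4 threads each (%ntid.x).
nctaid_x = 3
ntid_x = 4

indices = [
    global_thread_index(tid_x, ntid_x, ctaid_x)
    for ctaid_x in range(nctaid_x)   # every block index
    for tid_x in range(ntid_x)       # every thread index inside that block
]
print(indices)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

Every thread gets a distinct index from 0 to %ntid.x * %nctaid.x - 1, which is what lets each thread pick its own element of an array.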

Excerpt source: Wikipedia, the free encyclopedia